feat: add IncrementalFileCleanup strategy and dispatch in ExpireSnapshots::Finalize #648

Merged: wgtmac merged 15 commits into apache:main from shangxinli:pr-a-incremental-cleanup on May 20, 2026

@shangxinli
Contributor

Mirrors Java's IncrementalFileCleanup for the linear-ancestry case: each manifest is attributed to its writer snapshot, so two passes are enough instead of the full reachability scan. Cherry-pick protection via SnapshotSummaryFields::kSourceSnapshotId is preserved.

Finalize() now picks IncrementalFileCleanup when the expiration is "simple" (no explicit snapshot IDs, no removed snapshots outside the current main ancestry, and no retained snapshots outside the current main ancestry), and falls back to ReachableFileCleanup otherwise. The dispatch matches Java RemoveSnapshots.cleanExpiredSnapshots.
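
For readers following the dispatch rule, here is a minimal sketch of the "simple expiration" check. `ExpirationPlan`, `CleanupStrategy`, and `ChooseStrategy` are hypothetical names for illustration only, not the iceberg-cpp API, and the sketch assumes the main ancestry is computed from the metadata before expiration.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified view of one expiration; not the iceberg-cpp types.
struct ExpirationPlan {
  bool has_explicit_snapshot_ids = false;      // an explicit snapshot ID was requested
  std::unordered_set<int64_t> main_ancestry;   // assumed: ancestry of main before expiration
  std::vector<int64_t> removed_snapshot_ids;   // snapshots dropped by this expiration
  std::vector<int64_t> retained_snapshot_ids;  // snapshots that survive it
};

enum class CleanupStrategy { kIncremental, kReachable };

// Picks the incremental strategy only when all three "simple" conditions hold.
CleanupStrategy ChooseStrategy(const ExpirationPlan& plan) {
  if (plan.has_explicit_snapshot_ids) {
    return CleanupStrategy::kReachable;
  }
  auto on_main = [&plan](int64_t id) { return plan.main_ancestry.count(id) > 0; };
  for (int64_t id : plan.removed_snapshot_ids) {
    if (!on_main(id)) return CleanupStrategy::kReachable;
  }
  for (int64_t id : plan.retained_snapshot_ids) {
    if (!on_main(id)) return CleanupStrategy::kReachable;
  }
  return CleanupStrategy::kIncremental;
}
```

Any snapshot outside the main ancestry, removed or retained, sends the sketch down the reachable path, which is the conservative choice.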

Two existing cleanup tests (DeletesExpiredFiles, IgnoresExpiredDeleteManifestReadFailures) used an empty current manifest list, which is an unreachable-orphan scenario that only ReachableFileCleanup can resolve. They now call ExpireSnapshotId() to force the reachable path, which keeps their original intent and matches Java behavior. New tests cover both dispatch branches.

shangxinli and others added 10 commits April 22, 2026 16:43
Implement the file cleanup logic that was missing from the expire
snapshots feature (the original PR noted "TODO: File recycling will
be added in a followup PR").

Port the "reachable file cleanup" strategy from Java's
ReachableFileCleanup, following the same phased approach:

Phase 1: Collect manifest paths from expired and retained snapshots
Phase 2: Prune manifests still referenced by retained snapshots
Phase 3: Find data files only in manifests being deleted, subtract
         files still reachable from retained manifests (kAll only)
Phase 4: Delete orphaned manifest files
Phase 5: Delete manifest lists from expired snapshots
Phase 6: Delete expired statistics and partition statistics files
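
Phases 2 and 3 above boil down to set arithmetic. The sketch below uses an in-memory map in place of real manifest reads; `ManifestContents`, `CleanupPlan`, and `PlanCleanup` are hypothetical names chosen for illustration, not the code in this PR.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

using PathSet = std::set<std::string>;
// Assumed stand-in for reading manifests: manifest path -> data file paths it lists.
using ManifestContents = std::map<std::string, PathSet>;

struct CleanupPlan {
  PathSet manifests_to_delete;
  PathSet data_files_to_delete;
};

CleanupPlan PlanCleanup(const PathSet& expired_manifests,
                        const PathSet& retained_manifests,
                        const ManifestContents& contents) {
  CleanupPlan plan;
  // Phase 2: keep only manifests that no retained snapshot still references.
  for (const auto& manifest : expired_manifests) {
    if (retained_manifests.count(manifest) == 0) {
      plan.manifests_to_delete.insert(manifest);
    }
  }
  // Phase 3, first pass: candidate data files come from manifests being deleted.
  for (const auto& manifest : plan.manifests_to_delete) {
    auto it = contents.find(manifest);
    if (it == contents.end()) continue;
    plan.data_files_to_delete.insert(it->second.begin(), it->second.end());
  }
  // Phase 3, second pass: subtract files still reachable from retained manifests.
  for (const auto& manifest : retained_manifests) {
    auto it = contents.find(manifest);
    if (it == contents.end()) continue;
    for (const auto& file : it->second) {
      plan.data_files_to_delete.erase(file);
    }
  }
  return plan;
}
```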

Key design decisions matching Java parity:
- Best-effort deletion: suppress errors on individual file deletions
  to avoid blocking metadata updates (Java suppressFailureWhenFinished)
- Branch/tag awareness: retained snapshot set includes all snapshots
  reachable from any ref (branch or tag), preventing false-positive
  deletions of files still referenced by non-main branches
- Data file safety: only delete data files from manifests that are
  themselves being deleted, then subtract any files still reachable
  from retained manifests (two-pass approach from ReachableFileCleanup)
- Respect CleanupLevel: kNone skips all, kMetadataOnly skips data
  files, kAll cleans everything
- FileIO abstraction: uses FileIO::DeleteFile for filesystem
  compatibility (S3, HDFS, local), with custom DeleteWith() override
- Statistics cleanup via snapshot ID membership in retained set
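
As a rough illustration of the best-effort deletion and CleanupLevel decisions above, the sketch below counts failures instead of propagating them. The `DeleteFn` callback stands in for FileIO::DeleteFile or a DeleteWith() override; its bool return and the `DeleteReport` type are assumptions made for this example.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

enum class CleanupLevel { kNone, kMetadataOnly, kAll };

// Assumed callback shape: returns false when the delete fails.
using DeleteFn = std::function<bool(const std::string&)>;

struct DeleteReport {
  int attempted = 0;
  int failed = 0;
};

DeleteReport BestEffortCleanup(CleanupLevel level,
                               const std::vector<std::string>& metadata_files,
                               const std::vector<std::string>& data_files,
                               const DeleteFn& delete_fn) {
  DeleteReport report;
  if (level == CleanupLevel::kNone) {
    return report;  // kNone skips all file cleanup
  }
  auto delete_all = [&](const std::vector<std::string>& paths) {
    for (const auto& path : paths) {
      ++report.attempted;
      // A failed delete is recorded and skipped; it never aborts the loop.
      if (!delete_fn(path)) ++report.failed;
    }
  };
  if (level == CleanupLevel::kAll) {
    delete_all(data_files);  // data files are only touched at kAll
  }
  delete_all(metadata_files);
  return report;
}
```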

TODOs for follow-up:
- Multi-threaded file deletion (Java uses Tasks.foreach with executor)
- IncrementalFileCleanup strategy for linear ancestry optimization
  (Java uses this when no branches/cherry-picks involved)
- Fix O(M*S) I/O: Pre-cache ManifestFile objects in manifest_cache_ during
  Phase 1 (ReadManifestsForSnapshot), eliminating repeated manifest list
  reads in FindDataFilesToDelete.

- Fix storage leak: Use LiveEntries() instead of Entries() to match Java's
  ManifestFiles.readPaths behavior (only ADDED/EXISTING entries).

- Fix data loss risk: When reading a retained manifest fails, abort data
  file deletion entirely instead of silently continuing. Java retries and
  throws on failure here.

- Fix statistics file deletion: Use path-based set difference instead of
  snapshot_id-only check, preventing erroneous deletion of statistics files
  shared across snapshots.

- Remove goto anti-pattern: Extract ManifestFile lookup into
  MakeManifestReader() helper and use manifest_cache_ for direct lookup.

- Improve API: FindDataFilesToDelete now returns
  Result<unordered_set<string>> instead of using a mutable out-parameter.
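
To make the statistics fix above concrete, here is a small sketch of a path-based set difference. `StatisticsFile` and `ExpiredStatisticsPaths` are simplified, hypothetical stand-ins for this example; the real metadata types carry more fields.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Simplified stand-in for a statistics file entry in table metadata.
struct StatisticsFile {
  int64_t snapshot_id;
  std::string path;
};

// A path is expired only if no entry in the post-expiration metadata uses it.
std::unordered_set<std::string> ExpiredStatisticsPaths(
    const std::vector<StatisticsFile>& before,
    const std::vector<StatisticsFile>& after) {
  std::unordered_set<std::string> retained_paths;
  for (const auto& file : after) {
    retained_paths.insert(file.path);
  }
  std::unordered_set<std::string> expired_paths;
  for (const auto& file : before) {
    if (retained_paths.count(file.path) == 0) {
      expired_paths.insert(file.path);
    }
  }
  return expired_paths;
}
```

With a snapshot-ID-only check, a file shared by an expired and a retained snapshot would be deleted; comparing paths keeps it.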

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mirror Java's file cleanup class hierarchy for expire snapshots:
- Add abstract FileCleanupStrategy with shared DeleteFile() and
  ExpiredStatisticsFilePaths() utilities (path-based set difference)
- Add ReachableFileCleanup concrete class owning manifest_cache_,
  ReadManifestsForSnapshot(), and FindDataFilesToDelete()
- Move MakeManifestReader() to a free function in anonymous namespace
  using ICEBERG_ASSIGN_OR_RAISE
- Remove cleanup-specific private methods and manifest_cache_ from
  ExpireSnapshots class; Finalize() now delegates to the strategy
- Clear apply_result_ after consumption in Finalize()
- Rename DeleteFilePath to DeleteFile; use std::ignore for FileIO return
- Remove manifest_list.h and manifest_reader.h from the header
… stats file deletion

P0: ReadManifestsForSnapshot now returns bool. If any retained snapshot's
manifest list cannot be read, phases 2-4 (manifest and data file deletion)
are skipped entirely. An incomplete retained set makes it unsafe to compute
manifests_to_delete, as manifests still referenced by unreadable snapshots
would be wrongly included. This matches Java's throwFailureWhenFinished
behavior in ReachableFileCleanup. Manifest list deletion (phase 5) is
unaffected since it is keyed on expired snapshots only.
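
The P0 rule can be pictured as a gate in front of phases 2-4. In this sketch the `ManifestListReader` callback, which returns `std::nullopt` on a read failure, and the `PhasePlan` struct are assumptions made for illustration; the actual change reports the failure through the bool returned by ReadManifestsForSnapshot.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Assumed reader shape: nullopt means the manifest list could not be read.
using ManifestListReader =
    std::function<std::optional<std::vector<std::string>>(int64_t snapshot_id)>;

struct PhasePlan {
  bool run_manifest_and_data_cleanup = false;  // phases 2-4
  bool run_manifest_list_cleanup = true;       // phase 5, keyed on expired snapshots only
};

PhasePlan PlanPhases(const std::vector<int64_t>& retained_snapshot_ids,
                     const ManifestListReader& read_manifest_list) {
  PhasePlan plan;
  bool retained_set_complete = true;
  for (int64_t id : retained_snapshot_ids) {
    if (!read_manifest_list(id).has_value()) {
      // One unreadable list means the retained set is incomplete.
      retained_set_complete = false;
      break;
    }
  }
  plan.run_manifest_and_data_cleanup = retained_set_complete;
  return plan;
}
```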

P1: Remove physical statistics and partition-statistics file deletion (the
former phase 6). RemoveStatistics/RemovePartitionStatistics are still not
called in RemoveSnapshots (the TODO in table_metadata.cc), so the committed
metadata still references those files after they would be deleted on disk.
Deletion is deferred until the metadata-level removal is wired in, at which
point the two operations can be kept in sync.
…hots::Finalize

Mirrors Java's IncrementalFileCleanup for the linear-ancestry case: each
manifest is attributed to its writer snapshot, so two passes are enough
instead of the full reachability scan. Cherry-pick protection via
SnapshotSummaryFields::kSourceSnapshotId is preserved.

Finalize() now picks IncrementalFileCleanup when the expiration is
"simple" (no explicit snapshot IDs, no removed snapshots outside the
current main ancestry, and no retained snapshots outside the current
main ancestry), and falls back to ReachableFileCleanup otherwise. The
dispatch matches Java RemoveSnapshots.cleanExpiredSnapshots.

Two existing cleanup tests (DeletesExpiredFiles,
IgnoresExpiredDeleteManifestReadFailures) used an empty current
manifest list, which is an unreachable-orphan scenario that only
ReachableFileCleanup can resolve. They now call ExpireSnapshotId() to
force the reachable path, which keeps their original intent and
matches Java behavior. New tests cover both dispatch branches.
The merge of main into pr-a-incremental-cleanup auto-resolved by
keeping both copies in two places:
  * transaction.cc: a duplicate Result<const TableMetadata*>
    finalize_result definition, which made the file fail to compile.
  * expire_snapshots.cc: an orphaned ReachableFileCleanup direct call
    plus the obsolete TODO comment after the now-correct dispatch
    inside Finalize(), which made clang-format fail in CI.

Drop the duplicates so the file compiles and matches the intended
post-merge state.
Comment thread src/iceberg/update/expire_snapshots.cc Outdated
Java's IncrementalFileCleanup propagates the NumberFormatException when
source-snapshot-id can't be parsed, so cherry-pick protection cannot be
silently bypassed. Mirror that behavior by returning InvalidArgument from
both parse sites instead of skipping the entry.
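
A minimal sketch of strict parsing for source-snapshot-id follows. It uses the standard `std::from_chars` as a stand-in, and `ParseResult` and `ParseSourceSnapshotId` are hypothetical names; the PR itself returns InvalidArgument through the library's own result type.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical result type: either a value or an error message.
struct ParseResult {
  std::optional<int64_t> value;
  std::string error;
};

ParseResult ParseSourceSnapshotId(std::string_view text) {
  int64_t parsed = 0;
  const char* begin = text.data();
  const char* end = begin + text.size();
  auto [ptr, ec] = std::from_chars(begin, end, parsed);
  // Reject empty input, overflow, and trailing garbage rather than skipping the entry.
  if (ec != std::errc{} || ptr != end) {
    return {std::nullopt, "invalid source-snapshot-id: '" + std::string(text) + "'"};
  }
  return {parsed, ""};
}
```

Surfacing the error to the caller, rather than dropping the entry, is what keeps a malformed summary from silently disabling cherry-pick protection.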

Addresses wgtmac review feedback on PR apache#648.
Comment thread src/iceberg/update/expire_snapshots.cc Outdated
Comment thread src/iceberg/update/expire_snapshots.cc Outdated
Comment thread src/iceberg/update/expire_snapshots.cc Outdated
Comment thread src/iceberg/update/expire_snapshots.cc Outdated
Comment thread src/iceberg/test/expire_snapshots_test.cc Outdated
Comment thread src/iceberg/update/expire_snapshots.cc Outdated
- Replace std::stoll with StringUtils::ParseNumber<int64_t> + propagate via
  ICEBERG_ASSIGN_OR_RAISE in IncrementalFileCleanup cherry-pick parsing.
- Derive expired snapshot IDs inside each strategy from before/after metadata
  instead of accepting a separately-passed set that can drift.
- Inline strategy dispatch in ExpireSnapshots::Finalize -- stack-construct the
  selected strategy and call CleanFiles directly, no unique_ptr.
- Propagate Finalize() status from PendingUpdate::Commit() so parsing errors
  surface instead of being silently dropped.
- Add commit-path test CommitPropagatesMalformedSourceSnapshotId.
- Drop "Mirrors Java ..." phrasing from production comments and trim the
  test comment on IncrementalDispatchPreservesAncestorAddedFiles.
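
Deriving the expired set from before/after metadata, as in the second bullet above, is a plain set difference over snapshot IDs. The function name and vector-of-IDs inputs below are hypothetical simplifications for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Expired = snapshot IDs present before the update but absent after it.
std::unordered_set<int64_t> DeriveExpiredSnapshotIds(
    const std::vector<int64_t>& before_ids,
    const std::vector<int64_t>& after_ids) {
  std::unordered_set<int64_t> retained(after_ids.begin(), after_ids.end());
  std::unordered_set<int64_t> expired;
  for (int64_t id : before_ids) {
    if (retained.count(id) == 0) {
      expired.insert(id);
    }
  }
  return expired;
}
```

Because the set is recomputed from the two metadata versions, there is no separately passed set that can drift from what was actually committed.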
Comment thread src/iceberg/update/expire_snapshots.cc
Comment thread src/iceberg/update/pending_update.cc Outdated
Comment thread src/iceberg/test/expire_snapshots_test.cc
@wgtmac force-pushed the pr-a-incremental-cleanup branch 3 times, most recently from 5955db8 to a56f47e Compare May 20, 2026 14:29
Member

@wgtmac wgtmac left a comment

Thanks for adding this! It is a very complex algorithm, and I spent some time understanding it and making sure it matches the Java implementation. It looks good to me now, though it still needs some follow-ups to work as expected.

@wgtmac wgtmac force-pushed the pr-a-incremental-cleanup branch from a56f47e to 47dfa13 Compare May 20, 2026 14:53
@wgtmac wgtmac merged commit 3e7b20a into apache:main May 20, 2026
15 checks passed
shangxinli added a commit to shangxinli/iceberg-cpp that referenced this pull request May 21, 2026
Resolve conflicts with upstream IncrementalFileCleanup (apache#648):

- expire_snapshots.cc:
  * Keep both <thread> (for retry sleep) and add string_util.h
    (needed by IncrementalFileCleanup::StringUtils::ParseNumber).
  * Keep thread_pool_internal.h for the parallel-delete worker pool.
  * Combine the PR's bounded-retry DeleteFile/parallel DeleteFiles with
    upstream's new ExpiredSnapshotIds helper -- both are needed since
    IncrementalFileCleanup and the new ReachableFileCleanup signature
    rely on the helper, while the retry+pool path preserves this PR's
    hardening intent.

- expire_snapshots_test.cc:
  * Two reachable-path tests now need an explicit ExpireSnapshotId()
    call to bypass the new incremental dispatch (matches upstream apache#648).
  * Keep the mutex-protected DeleteWith callback added by this PR --
    required because the strategy's worker pool invokes the callback
    from multiple threads.
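
To show why the test callback needs a mutex, here is a small sketch of a recording callback invoked from several worker threads. `RecordingDeleter` and `DeleteInParallel` are hypothetical names, and plain `std::thread` workers stand in for the PR's worker pool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Records deleted paths; the mutex guards the vector against concurrent pushes.
class RecordingDeleter {
 public:
  void Record(const std::string& path) {
    std::lock_guard<std::mutex> lock(mu_);
    deleted_.push_back(path);
  }
  std::vector<std::string> Sorted() const {
    std::lock_guard<std::mutex> lock(mu_);
    std::vector<std::string> copy = deleted_;
    std::sort(copy.begin(), copy.end());
    return copy;
  }

 private:
  mutable std::mutex mu_;
  std::vector<std::string> deleted_;
};

// Splits paths across threads in a strided fashion and calls the callback on each.
void DeleteInParallel(const std::vector<std::string>& paths, std::size_t num_threads,
                      const std::function<void(const std::string&)>& callback) {
  if (num_threads == 0) num_threads = 1;
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < num_threads; ++t) {
    workers.emplace_back([&paths, &callback, t, num_threads] {
      for (std::size_t i = t; i < paths.size(); i += num_threads) {
        callback(paths[i]);
      }
    });
  }
  for (auto& worker : workers) worker.join();
}
```

Without the lock, concurrent `push_back` calls on the shared vector would be a data race.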

All 340 table_update_test cases and 7 ThreadPoolTest cases pass.
